Storage - Content Validation Decoder Implementation#47016
Storage - Content Validation Decoder Implementation#47016gunjansingh-msft wants to merge 30 commits intofeature/storage/content-validationfrom
Conversation
API Change CheckAPIView identified API level changes in this PR and created the following API reviews |
kyleknap
left a comment
There was a problem hiding this comment.
It's looking good! It's great to see us getting a working end to implementation of downloads. I just left some higher-level comments as we keeping hashing out the download implementation.
ibrandes
left a comment
There was a problem hiding this comment.
i will be 100% honest - i was really struggling to walk through the logic in here, there's so much to look at x) my comments should highlight some of the more confusing areas. don't worry about polishing anything too much, but adding more subfunctions, centralizing util logic, and reducing the nesting should make this a lot easier to follow.
overall, i think you're going down the right path with how you utilize and set the decoder state. also, definitely look into downloadToFileImpl as mentioned in one of my comments - ProgressListener and ProgressReporter might be able to help us reduce some of the boilerplate code you've had to write.
|
Hi @gunjansingh-msft. Thank you for your interest in helping to improve the Azure SDK experience and for your contribution. We've noticed that there hasn't been recent engagement on this pull request. If this is still an active work stream, please let us know by pushing some changes or leaving a comment. Otherwise, we'll close this out in 7 days. |
kyleknap
left a comment
There was a problem hiding this comment.
It's looking good. I like the overall structure of where the change is going. I have not dived too deep into the PR yet but just wanted to post some of the questions/comments that I had so far as I work through understanding the changes.
All the latest comments are addressed |
…pipelinepolicy Resolve conflicts in BuilderHelper (both decoder and encoder content validation policies) and StorageCrc64Calculator Javadoc. Align download path with ContentValidationAlgorithm rename; fix downloadStreamWithResponseInternal variable mix-up. Delegate CRC64/AUTO detection to ContentValidationModeResolver.isCrc64OrAuto from DownloadValidationUtils. Update decoder tests for setContentValidationAlgorithm; remove MD5 download test (enum no longer includes MD5). Fix incorrect @link in ContentValidationModeResolver Javadoc. Made-with: Cursor
ibrandes
left a comment
There was a problem hiding this comment.
looking good so far! mostly just had a couple comments on util cleanup and code consolidation - your decoder lifecycle handling looks good to me
| StructuredMessageDecoder decoder = new StructuredMessageDecoder(expectedLength); | ||
|
|
||
| Flux<ByteBuffer> decodedStream = decodeStream(httpResponse.getBody(), decoder); | ||
| return new DecodedResponse(httpResponse, decodedStream); |
There was a problem hiding this comment.
here, we are decoding the structured body into a payload body, but we are still wrapping the response with the original Content-Length header. we should add an override to DecodedResponse for the getHeaders method where we remove the Content-Length header:
@Override
public HttpHeaders getHeaders() {
HttpHeaders headers = new HttpHeaders(originalResponse.getHeaders());
headers.remove(HttpHeaderName.CONTENT_LENGTH);
return headers;
}
another option is to use the adjusted Content-Length value, but I'm not sure if we know that until everything is consumed.
There was a problem hiding this comment.
@ibrandes and @gunjansingh-msft I'm not sure if I'm following why we'd remove the Content-Length header all together? My main concern is that using the Content-Length header is a common way to derive the size of blob that is being downloaded and downstream logic might rely on it being present? Might be interesting to check with what .NET does here as well to make sure we are consistent. I do generally agree that it makes sense to override the Content-Length with the value of the x-ms-structured-content-length in order to not break existing flows that leverage content length as the blob size.
There was a problem hiding this comment.
I’ve aligned this with the .NET behavior in the latest revision.
In the .NET SDK, the response body is wrapped via StructuredMessageDecodingStream.WrapStream, but the response headers (including Content-Length) are not modified and continue to reflect the wire payload size.
I’ve removed the Content-Length override here as well, so the Java implementation now matches the .NET and by passing headers through unchanged.
Callers can still get the decoded length via x-ms-structured-content-length, BlobDownloadHeaders.getBlobSize(), or Content-Range if needed.
There was a problem hiding this comment.
adding this as an ADO item to further investigate.
kyleknap
left a comment
There was a problem hiding this comment.
Looking good. Just leaving some more questions that I had as I'm working through fully understanding the implementation. Mainly focused on the policy and the decoded response class. Plan to dig more into the structured message decoder later.
| StructuredMessageDecoder decoder = new StructuredMessageDecoder(expectedLength); | ||
|
|
||
| Flux<ByteBuffer> decodedStream = decodeStream(httpResponse.getBody(), decoder); | ||
| return new DecodedResponse(httpResponse, decodedStream); |
There was a problem hiding this comment.
@ibrandes and @gunjansingh-msft I'm not sure if I'm following why we'd remove the Content-Length header all together? My main concern is that using the Content-Length header is a common way to derive the size of blob that is being downloaded and downstream logic might rely on it being present? Might be interesting to check with what .NET does here as well to make sure we are consistent. I do generally agree that it makes sense to override the Content-Length with the value of the x-ms-structured-content-length in order to not break existing flows that leverage content length as the blob size.
kyleknap
left a comment
There was a problem hiding this comment.
We should be good to merge it into the feature branch as our base point. Just had one question that we can track as a follow up. From there, we can start sending smaller PR improvements to help break up the different iterations we wanted to make and make it easier to track/follow
| } | ||
|
|
||
| @Override | ||
| public void close() { |
There was a problem hiding this comment.
Were we planning to remove this method per this comment? #47016 (comment) Mainly asking, fine to defer it for a follow up task like we are doing for the content length.
There was a problem hiding this comment.
yep - thanks for catching this! looks like i accidentally reverted the removal :,)
No description provided.